
Description

DOCUMENT IMAGE PROCESSOR, METHOD FOR EXTRACTING
DOCUMENT TITLE, AND METHOD FOR IMPARTING DOCUMENT
TAG INFORMATION



Technical Field Of The Invention 

This invention relates to a document image processor and a
document image processing method for storing and managing document
images as image data, and more specifically relates to an apparatus and
a method for extracting title regions and marks attached by a user from
a document image to use them as document tag information.

Background Art

With the improvement of data storage capacity, document image
processors have rapidly become popular in which a paper document read
by a scanner or the like is stored and managed as a document image,
that is, image data of the document.

In such a document image processor, each document image is
registered in association with character strings that serve as document
tag information, such as a keyword or a title, so that a desired
document image can be searched for among the plural document images
stored in a data storage.

Fig. 19 shows the document tag information conceptually. As
shown in the drawing, document tag information such as
"Confidential" 191, "A-company" 192, "Year 1999" 193 and "New car" 194,
for example, acts as keywords for a document image 190. Provided that
plural pieces of document tag information are attached to each document
image in this way, it is possible to search for a desired document
image by narrowing down with those pieces of document tag information.

In a conventional way, a user has inputted the document tag
information by hand when storing the document image. When the user has
to input the document tag information, however, the workload grows as
the number of documents increases. Accordingly, such an inputting
operation is quite impractical. So in recent years another apparatus
has appeared that is able to recognize characters on the document
image, handle a recognized character string as document tag
information, and then attach the document tag information to the
document image without manual labor.

For instance, Japanese laid-open publication No. 8-147313
discloses a method of using a mark sheet. In the method, first, a user
checks off a check box of document tag information to be attached to a
document image, the document tag information being described on the
mark sheet in a specific form. The mark sheet is read by a document
image processor before the paper document is read, whereby the document
tag information to be attached can be specified from the nominees of
document tag information registered in advance. The method does not
require the use of an input device such as a keyboard or a pointing
device, and it is possible to attach the document tag information
automatically to the document image to be registered.

Incidentally, for effective searching of document images, it is
very important to give appropriate document tag information to them.
Specifically, a general searching method is to specify the document tag
information corresponding to a desired document image from a list of
plural pieces of document tag information displayed on a screen. In
order to specify such document tag information quickly, each piece of
document tag information should express the contents of the document
directly.

Japanese laid-open publication No. 8-202859 discloses another
method wherein a region including a title-character string (which is
called a "title region" hereafter) is extracted from a document image,
the image of the title region is recognized, and then the recognized
title-character string is made document tag information. Since the
title-character string represents the contents of the document
directly, an image data processor adopting the title-region extracting
method can quickly specify the document tag information corresponding
to the desired document image.

The above method of extracting the title region, which is
disclosed in Japanese laid-open publication No. 8-202859, is based on
the aspect that the title characters are in the largest size among all
characters included in the document image. After dividing the document
image into plural regions (a region in which consecutive character
rectangles are combined together) and calculating an average of
character size in the respective regions, the region with the maximum
average character size is extracted as a title region. Accordingly, the
title-region extracting method naturally extracts only one title region
for a document image.

However, if there are plural documents with similar contents,
the documents always have similar titles. Therefore, the conventional
title-region extracting method had a problem that, when there are
plural documents with similar contents, it is impossible to quickly
specify the document tag information corresponding to the desired
document image.

In order to avoid the above problem, there is a method of not
attaching similar titles to documents when preparing the paper
documents. However, it is undesirable to request a user to do such a
preparatory operation.

On the other hand, the method in Japanese laid-open publication
No. 8-147313, which uses a mark sheet, involves very troublesome work:
when the document image processor is configured, it is necessary to
define, as software, the form of the mark sheet describing all items of
the document tag information and the reading method of the mark sheet.
In addition, when nominees of new document tag information are added
and registered later on, the items of document tag information are
changed. Thereby, it is necessary to reconstruct the form of the mark
sheet and the reading method.

In addition, in the case of using the mark sheet, since the user
always uses the same sheet to check off the check boxes, it is hard for
the user to visually confirm which document tag information is attached
to the document image, and this causes frequent inputting mistakes.

The invention is proposed taking the above problems into
consideration, and has an object to provide a document image processor
for extracting title regions and marks attached by the user from a
document image to use them as document tag information, and to provide
a method for extracting document titles and a method for imparting
document tag information.



Disclosure of the Invention

The invention adopts the following means in order to achieve
the objects.

In a document image processor that, as shown in Fig. 1,
comprises region dividing means 103 for dividing a document image into
a plurality of regions, and title-region extracting means 104 for
calculating a first average that is an average of character size in
each region divided by the region dividing means 103 and extracting
title regions from all the regions according to the first averages, the
following means is adopted.

After calculating a second average equivalent to an average
height of characters included in all the regions, the title-region
extracting means 104 compares the first average with an extracting
criterion that is the second average multiplied by an extracting
parameter, and then extracts, as title regions, regions with the first
average larger than the extracting criterion. Accordingly, if a region
has the first average larger than the extracting criterion, the region
is extracted as a title region. Therefore, it is possible to extract a
plurality of title regions from a document image.

In addition, the title-region extracting means 104 may be
arranged to calculate extracting criteria on a plurality of levels by
using extracting parameters on a plurality of levels. Thereby, the
extracting judgment can be performed based on the respective extracting
criteria, so that it is possible to extract not only title regions but
also subtitle regions (a region including a subtitle-character string
composed of characters a little smaller than the title characters).

Furthermore, the title-region extracting means 104 may determine
the extracting parameters on a plurality of levels based on a value
found by dividing the maximum value of the first average by the second
average. If the extracting parameters are calculated based on the
maximum value of the first average instead of being limited to fixed
values, it is possible to obtain the extracting criteria more
accurately.

And since the trimmed mean method, which excludes characters
larger than a specific proportion of the maximum character size and
characters smaller than a specific proportion of the minimum character
size, is used to calculate the second average and the first average, it
is possible to further improve the accuracy of the extracting.

Moreover, the image of characters included in the extracted title
region can be converted to a title-character string, that is, a
character code string, by character recognizing means 105. Correcting
means 112 corrects the title-character string; thereby a user can
change the title of the document image freely.

Secondly, in the document image processing for preparing and
storing document images by reading a paper document, reference tag
information storage means 1215 is provided as shown in Fig. 12 for
storing in advance the reference tag information (a nominee of document
tag information) together with an attribute value of the reference tag
information.

Next, mark extracting means 1205 is provided for extracting a
specific mark attached to a paper document by a user. The mark
indicates a general mark that is imparted in order for a user to
identify the paper document, such as a stamp, a seal, an illustration,
a signature in specific handwriting, and the like.

Calculating means 120A is provided in order to calculate a 
characteristics value representing the characteristics of the mark according 
to the variance of pixels composing the extracted mark. 

Document tag information imparting means 1208 is provided for
comparing the attribute value and the characteristics value, selecting
the reference tag information with the highest degree of similarity,
and then imparting the selected reference tag information to the
document image.
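
The comparison of attribute values with the characteristics value can be sketched as follows. This is only an illustration under stated assumptions: the text does not define exactly how the variance is taken or how similarity is measured, so the sketch assumes the characteristics value is the variance of the pixel intensities of the mark and that the most similar reference tag is the one whose stored attribute value differs least from it. The function names and the sample values are hypothetical.

```python
from statistics import pvariance


def characteristics_value(mark_pixels):
    """Return the population variance of the pixel values of an extracted mark.

    Assumption: "variance of pixels" is read here as intensity variance.
    """
    flat = [value for row in mark_pixels for value in row]
    return pvariance(flat)


def select_reference_tag(mark_pixels, reference_tags):
    """Pick the reference tag whose attribute value is closest to the mark's value.

    reference_tags maps a tag string to its stored attribute value.
    Assumption: similarity is highest when the absolute difference is smallest.
    """
    value = characteristics_value(mark_pixels)
    return min(reference_tags, key=lambda tag: abs(reference_tags[tag] - value))


# Hypothetical 2x2 mark image and reference tag table.
mark = [[0, 255], [255, 0]]
tags = {"Confidential": 16000.0, "A-company": 4000.0}
print(select_reference_tag(mark, tags))
```

A single scalar is a very coarse descriptor; a practical system would likely need a richer attribute vector, which the same selection step could accommodate with a different distance function.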

According to the above steps, it is possible to automatically
impart the document tag information to the document image based on the
mark that the user generally uses in the routine work of document
filing. Therefore, the invention makes it easy to carry out document
management in the office.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows a schematic functional block diagram of a
document image processor in the first embodiment of this invention.

Fig. 2 shows a flowchart of the title-region extracting process
in the first embodiment of the invention.

Fig. 3 shows a flowchart of the title-region extracting process
in the second embodiment of the invention.

Fig. 4 shows a flowchart of the title-region extracting process
in the third embodiment of the invention.

Fig. 5 shows an explanatory diagram of the registration
information management table in the first embodiment.

Fig. 6 shows an explanatory diagram of the registration
information management table in the second embodiment.

Fig. 7 shows an explanatory diagram of the labeling process.

Fig. 8 shows an explanatory diagram of the region dividing
process.

Fig. 9 shows a diagram indicating the correlation among the
height, the width, and the area of the character rectangle.

Fig. 10 shows a diagram representing the contents displayed
on a display screen at the searching in the first embodiment.

Fig. 11 shows a diagram representing the contents displayed
on a display screen at the searching in the second embodiment.

Fig. 12 shows a schematic functional block diagram of a
document image processor in the fifth and sixth embodiments of the
invention.

Fig. 13 shows an explanatory diagram of the registration image
management table in the fifth and the sixth embodiments of the invention.

Fig. 14 shows an explanatory diagram of the mark management
table in the fifth embodiment of the invention.

Fig. 15 shows an explanatory diagram of the reference tag
information management table.

Fig. 16 shows an explanatory diagram of the extracted mark
image.

Fig. 17 shows an explanatory diagram of the registration image
management table in the sixth embodiment of the invention.

Fig. 18 shows an explanatory diagram of the mark management
table in the sixth embodiment of the invention.

Fig. 19 shows an explanatory diagram expressing the concept
of the document tag information.

Best Mode for Carrying out the Invention
(EMBODIMENT 1)

The embodiments of the invention are explained hereafter
referring to the drawings. The embodiments 1, 2, 3 and 4 concern a
document image processor for extracting plural titles from a paper
document.

Fig. 1 shows a schematic functional block diagram of a
document image processor to which the present invention is applied. The
configuration of the processor will be explained together with the
process of the document image registration.

First, an image inputting means 101 such as a scanner, for
example, performs the photoelectric conversion of a paper document, and
then a document image 108a that is multi-valued image data is obtained.
After an image processing means 111a performs an appropriate processing
for the storing (compression, for example), the document image is
registered in a document image area Aa of a storage means 108. Needless
to say, it may be configured that the document image processor is not
provided with the image processing means 111a, but registers the
multi-valued image data in the document image area Aa without change.

The document image inputted to the image processing means 111a
from the image inputting means 101 is also inputted to an image
processing means 111b. Here, the document image is converted to binary
image data, and then stored into an image memory 107. Referring to the
document image stored in the image memory 107, a character rectangle
creating means 102 performs the following labeling process. The
labeling is a processing for imparting the same label value
(identification information) as a black pixel to be given attention
(which is called a "target pixel" hereafter) to other black pixels
contiguous to the target pixel in the 8 directions, that is, the top
side, the upper right side, the right side, the lower right side, the
down side, the lower left side, the left side, and the upper left side
of the target pixel. That is to say, as shown in Fig. 7, where 8
pixels W1, W2, W3, W4, W6, W7, W8 and W9 are contiguous to the target
pixel W5, the character rectangle creating means 102 gives the same
label value as that of the target pixel W5 to the black pixels W2, W3
and W8. According to such labeling, the same label value can be given
per black-pixel-connected component in the document image (per group
of contiguous black pixels).
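
The labeling described above corresponds to what is commonly called 8-connected component labeling. The following is a minimal sketch, not the processor's actual implementation: it assumes the binary image is a list of rows in which 1 denotes a black pixel, and it uses a breadth-first flood fill, which is one of several possible ways to propagate a label value. The function name is hypothetical.

```python
from collections import deque


def label_components(image):
    """Give one label value to each group of 8-connected black pixels (value 1).

    Returns (labels, count), where labels has the same shape as image and
    0 means "no label" (a white pixel).
    """
    height = len(image)
    width = len(image[0]) if image else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or labels[y][x]:
                continue
            count += 1  # new target pixel starts a new label value
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                # Visit the 8 pixels contiguous to the current pixel.
                for ny in range(max(0, cy - 1), min(height, cy + 2)):
                    for nx in range(max(0, cx - 1), min(width, cx + 2)):
                        if image[ny][nx] == 1 and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


image = [
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_components(image)
print(count, labels)
```

Because diagonal neighbours are included, the two pixels on the left of the sample image share one label even though they touch only at a corner.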

Next, the character rectangle creating means 102 prepares a
character rectangle by cutting off the black-pixel-connected component
attached with the same label value, and then transfers the character
rectangle to region dividing means 103. Here, the "character rectangle"
means a circumscribed rectangle of a black-pixel-connected component.
However, a character is not always configured by one
black-pixel-connected component. In consideration of this, it can be
arranged that the sections of black pixels in the document image are
expanded before the labeling. Specifically, the processing converts the
8 pixels contiguous to the target pixel to black pixels. The processing
is repeated an appropriate number of times (generally two or three
times), thereby the sections of black pixels are enlarged, and
accordingly it is possible to combine the black-pixel-connected
components, which form a character and are apart from each other within
the character, into one unit. If the labeling is performed after the
above-mentioned processing, it is possible to prepare the character
rectangle per character precisely.
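
The expansion of black pixels and the circumscribed rectangle can be sketched as below. This is an illustrative sketch under assumptions: the image is a list of rows with 1 for black pixels, the label array uses 0 for unlabeled pixels, and rectangles are expressed as (left, top, right, bottom) in pixel coordinates. The function names are hypothetical.

```python
def dilate(image, times=2):
    """Turn the 8 neighbours of every black pixel (1) black, repeated `times` times."""
    height = len(image)
    width = len(image[0]) if image else 0
    for _ in range(times):
        grown = [row[:] for row in image]  # work on a copy of the current pass
        for y in range(height):
            for x in range(width):
                if image[y][x] != 1:
                    continue
                for ny in range(max(0, y - 1), min(height, y + 2)):
                    for nx in range(max(0, x - 1), min(width, x + 2)):
                        grown[ny][nx] = 1
        image = grown
    return image


def character_rectangles(labels):
    """Map each non-zero label value to its circumscribed rectangle."""
    rects = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label == 0:
                continue
            left, top, right, bottom = rects.get(label, (x, y, x, y))
            rects[label] = (min(left, x), min(top, y), max(right, x), max(bottom, y))
    return rects


print(dilate([[0, 0, 0], [0, 1, 0], [0, 0, 0]], times=1))
print(character_rectangles([[1, 0, 0], [1, 0, 2], [0, 0, 2]]))
```

Repeating the expansion too many times would also merge neighbouring characters, which is presumably why only two or three repetitions are suggested.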

When the character rectangle creating means 102 completes the
processing, the region dividing means 103 detects areas adjacent to
respective character rectangles, and then divides the document image
into regions by combining the character rectangles contiguous with each
other. For instance, the region dividing means 103, upon receipt of the
character rectangles C1 to C12 as shown in Fig. 8, combines the
character rectangles C1 to C4, C5 to C9, and C10 to C12 respectively,
and then divides the document image into regions E1, E2 and E3.
According to such region dividing, the document image can be divided
into regions per character string. In order to judge whether the
character rectangles are contiguous with each other or not, or whether
there is an interlinear blank between the character rectangles or not,
proper threshold values of a character gap and an interlinear space may
be used.
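
One possible way to combine contiguous character rectangles with the two thresholds is sketched below. It is an assumption-laden illustration rather than the processor's method: rectangles are (left, top, right, bottom) tuples, two rectangles are treated as contiguous when both their horizontal gap and their vertical gap are within the thresholds, and a union-find structure merges contiguous rectangles into regions. The names `max_char_gap` and `max_line_gap` are hypothetical.

```python
def interval_gap(a_low, a_high, b_low, b_high):
    """Distance between two 1-D intervals; 0 when they overlap or touch."""
    return max(0, max(a_low, b_low) - min(a_high, b_high))


def divide_regions(rects, max_char_gap, max_line_gap):
    """Group rectangle indices into regions of mutually contiguous rectangles."""
    parent = list(range(len(rects)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, (l1, t1, r1, b1) in enumerate(rects):
        for j in range(i + 1, len(rects)):
            l2, t2, r2, b2 = rects[j]
            close_horizontally = interval_gap(l1, r1, l2, r2) <= max_char_gap
            close_vertically = interval_gap(t1, b1, t2, b2) <= max_line_gap
            if close_horizontally and close_vertically:
                parent[find(i)] = find(j)

    regions = {}
    for i in range(len(rects)):
        regions.setdefault(find(i), []).append(i)
    return sorted(regions.values())


# Two character strings separated by a large interlinear blank.
rects = [(0, 0, 10, 10), (12, 0, 22, 10), (0, 40, 8, 48), (10, 40, 18, 48)]
print(divide_regions(rects, max_char_gap=3, max_line_gap=5))
```

The pairwise comparison is quadratic in the number of rectangles, which is acceptable for a sketch but would likely be replaced by a spatial index for large pages.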

As a result of the above processing, it is possible to obtain the
size of all the characters in the document image (which will be
described later), the number of the divided regions, and the number of
the character rectangles in each region. It is arranged in the
invention that a serial number starting from 1 be given to each divided
region and also to each character rectangle included in the region.
Hereinafter, the number of character rectangles in the n-th region is
represented by NumChar_n, and the size of the m-th character in the
n-th region is represented by SizeChar_{n,m}.

Incidentally, as shown in Fig. 9, even if the characters are the
same font and the same point size, the widths W1 to W4 and the areas A1
to A4 of the character rectangles depend on the kind of the character
itself and fluctuate sharply, whereas the heights H1 to H4 of the
character rectangles fluctuate only a little. Therefore, the invention
may adopt as the character size "the height of a character rectangle",
on which the point size of a character font is reflected comparatively
correctly.

Here, a title-region extracting means 104 extracts only specific
regions as title regions from all the regions divided as above. The
title-region extracting is explained hereinafter according to a
flowchart shown in Fig. 2.

First, the title-region extracting means 104 calculates a first
average per region (Fig. 2, Step 1). The first average is an average
value of the size of the characters included in a region. The first
average in the n-th region, SizeReg_n, is found by dividing the sum of
the character sizes SizeChar_{n,m} of the characters included in the
region by the number NumChar_n of characters in the region. This
correlation is represented by the following equation.
[Equation 1]

SizeReg_n = \sum_{m=1}^{NumChar_n} SizeChar_{n,m} / NumChar_n

Next, according to the first average SizeReg_n and the number of
characters NumChar_n in each region, a second average, SizeALL, which
is an average of character size in the document image, is calculated by
the following equation (Fig. 2, Step 2).
[Equation 2]

SizeALL = \sum_{n} (SizeReg_n \times NumChar_n) / \sum_{n} NumChar_n

The method of calculating the first average SizeReg_n and the
second average SizeALL is not restricted to the above method. For
example, it is possible to adopt the Trimmed Mean Method (a method of
calculating the average after discarding a specific proportion, for
example, 10% of data, from the minimum value side and the maximum value
side), which will be explained later.
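
The two averages and the trimmed mean variant can be sketched as follows. This is a minimal illustration, assuming each region is given as a non-empty list of character sizes (heights of character rectangles); the function names are hypothetical, and the trimmed mean discards the stated proportion of values from each end of the sorted list.

```python
def first_averages(regions):
    """Average character size per region (Equation 1)."""
    return [sum(sizes) / len(sizes) for sizes in regions]


def second_average(regions):
    """Average character size over the whole document image (Equation 2)."""
    averages = first_averages(regions)
    weighted = sum(avg * len(sizes) for avg, sizes in zip(averages, regions))
    return weighted / sum(len(sizes) for sizes in regions)


def trimmed_mean(sizes, proportion=0.1):
    """Average after discarding `proportion` of the values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(sizes)
    k = int(len(ordered) * proportion)  # number of values dropped per side
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


regions = [[20, 20], [10, 10, 10, 10]]
print(first_averages(regions), second_average(regions))
print(trimmed_mean([1, 10, 10, 10, 10, 10, 10, 10, 10, 100]))
```

Weighting each first average by its character count makes the second average equal to the plain mean over every character in the document image, so a region with few characters does not dominate it.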

Here, the title-region extracting means 104 performs the
extracting judgment of the title region according to whether the
following equation for the extracting judgment is established or not.
[Equation 3]

SizeReg_n \geq SizeALL \times \alpha

That is to say, after comparing a value found by multiplying
the calculated second average SizeALL by an extracting parameter α (an
extracting criterion) with the first average SizeReg_n, only the region
where the equation of the extracting judgment is established is
extracted as a title region (Fig. 2, Step 3 to 4 to 5). The extracting
parameter α should be a constant that is larger than 1.0, and it is
preferable to be 1.2, for example.
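
The extracting judgment of Equation 3 can be illustrated with a short sketch. It assumes each region is a non-empty list of character sizes and returns the serial numbers (starting from 1) of the regions judged to be title regions; the function name and the sample sizes are hypothetical.

```python
def extract_title_regions(regions, alpha=1.2):
    """Return serial numbers of regions whose first average meets the criterion."""
    total_chars = sum(len(sizes) for sizes in regions)
    size_all = sum(sum(sizes) for sizes in regions) / total_chars  # second average
    criterion = size_all * alpha  # extracting criterion
    return [
        n
        for n, sizes in enumerate(regions, start=1)
        if sum(sizes) / len(sizes) >= criterion
    ]


# Hypothetical page: a large title, body text, a mid-sized heading, more body text.
regions = [[30, 30, 30], [12] * 20, [20, 20, 20, 20], [12] * 30]
print(extract_title_regions(regions))
```

With these sample sizes both the first and the third region exceed the criterion, which shows how more than one title region can be extracted from one document image.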

When the extracting judgment has been performed for all the
regions by repeating the above steps (Fig. 2, "NO" in Step 3), the
title-region extracting is completed. Then the respective title-region
images 108b extracted as above are registered in a title area Ab of the
storage means 108.

Next, character recognizing means 105 cuts the extracted
title-region images out of the document image, performs the character
recognizing for each title-region image, and then obtains
title-character strings that are character code strings. The
title-character strings thus obtained are transferred to display
control means 110 via correcting means 112, and are shown to a user by
displaying them in a list view on a display screen that is not shown
(see Fig. 10(1)).

The user confirms each title-region image and title-character
string thus displayed, and if the user wants to register one of the
title-character strings in the same state as shown on the display
screen, the user inputs a registration instruction through an
instruction inputting means 109. Then, the title-character string is
transferred from the character recognizing means 105 to document
registering means 106.

On the other hand, if the user wants to correct or change any of
the title-character strings, the user double-clicks the displayed
title-character string by means of a pointing device of the instruction
inputting means 109. In response to the double-clicking, the correcting
means 112 instructs the display control means 110 to blink the
title-character string on the display screen and display the cursor
within the character string. Then the user, operating the keyboard of
the instruction inputting means 109, inputs a corrected character
string in the correcting means 112, whereby the character string
following the cursor can be replaced with the corrected character
string. By inputting the corrected character string to the character
recognizing means 105 from the correcting means 112, the
title-character string is corrected. Likewise, when the user instructs
the registration through the instruction inputting means 109, the
corrected title-character string is transferred from the character
recognizing means 105 to the document registering means 106.

However, in a case where the confirmation and the correcting are
not programmed, the contents recognized by the character recognizing
means 105 are transferred to the document registering means 106 as they
are, without being displayed on the display screen.

After receiving the title-character string, the document
registering means 106 registers the registration information, which is
composed of the storage pointers of the document image 108a and the
title-region image 108b in the storage means 108, the title-character
string, and the position and size of the title region in the document
image, into a registration information management table 108c formed in
the table area Ac on the storage means 108 (see Fig. 5). Here, the
storage pointer of the document image 108a can be obtained from the
document image area Aa on the storage means 108, the storage pointer of
the title-region image 108b can be obtained from the title area Ab on
the storage means 108, and the position and the size of the title
region can be obtained from the character recognizing means 105.

After the registration information management table 108c is
prepared as above, when an instruction to search for a document image
is inputted from the instruction inputting means 109 such as a keyboard
and a pointing device, the display control means 110 displays in a list
view the title-region images and the title-character strings stored as
above on the display screen (Fig. 10(1)).

When the user selects a desired title (a title-region image or
a title-character string) from the listed titles on the display screen
by means of the instruction inputting means 109, the display control
means 110 displays on the display screen the document image
corresponding to the selected title. At this time, as shown in
Fig. 10(11), it is preferable that the title region in the document
image is indicated clearly by circumscribing it with a rectangular
frame F. The rectangular frame F can be prepared according to the
position and the size of the title region registered in the
registration information management table 108c.

In addition to the above method of selecting one title from the
list displayed on the display screen, it is needless to say that
another method can be used in which, when specific document tag
information is inputted from the instruction inputting means 109 and
the title corresponding to the specific document tag information has
been registered in the registration information management table 108c,
the corresponding document image is displayed on the screen.
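
The registration information and the search by inputted document tag information can be pictured with a small sketch. The actual layout of the registration information management table is shown only in Fig. 5, so the field names, the pointer format and the exact-match lookup below are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Registration:
    """One row of a hypothetical registration information management table."""
    document_pointer: str      # storage pointer of the document image
    title_image_pointer: str   # storage pointer of the title-region image
    title_string: str          # recognized (or corrected) title-character string
    position: Tuple[int, int]  # top-left corner of the title region
    size: Tuple[int, int]      # width and height of the title region


def find_documents(table: List[Registration], tag: str) -> List[str]:
    """Return document pointers whose registered title equals the inputted tag."""
    return [row.document_pointer for row in table if row.title_string == tag]


table = [
    Registration("Aa/0001", "Ab/0001", "New car", (40, 30), (200, 36)),
    Registration("Aa/0001", "Ab/0002", "Confidential", (40, 80), (150, 24)),
    Registration("Aa/0002", "Ab/0003", "New car", (35, 28), (210, 36)),
]
print(find_documents(table, "New car"))
```

Because one document image may have several rows, one per extracted title region, a search can be narrowed by intersecting the results for several tags.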

In accordance with this embodiment described as above, since
it is arranged that regions with the first average larger than the
extracting criterion be extracted as title regions, it is possible to
extract plural title regions from a document image. Therefore, even if
there are many document images having similar contents, it is possible
to quickly specify the document tag information (the title)
corresponding to the desired document image.

The above explanation does not refer to the steps for a case
where there is no region in which the extracting judgment equation is
established at the title-region extracting. In this case, a message
that no title region can be extracted is displayed on the display
screen and the user is requested to input a character string to be the
document tag information. The user inputs the character string in
response to the request, and then the inputted character string can be
used as the title-character string for the document image.

(EMBODIMENT 2) 

It is arranged in the embodiment 1 that, if regions have the
first average larger than the extracting criterion, those regions are
all extracted as title regions without discrimination. By this method,
it is impossible to perform appropriate display processing of the
titles by discriminating between the character sizes, that is to say,
processing such as, where a title-character string in a slightly
smaller character size is handled as a subtitle, the subtitle-character
string is not listed but only the title-character string is displayed.

In this embodiment of the invention, the above-mentioned problem is
settled by calculating plural extracting criteria by using plural levels
of extracting parameters, and then extracting the title regions by
correlating the title regions with the level attributes (information
indicating the level of the extracting). The configuration will be
explained hereafter regarding the points different from that of the
embodiment 1.

The title-region extracting means 104, which calculates the first
average $\mathrm{SizeReg}_n$ (the average character size of each region)
and the second average $\mathrm{SizeAll}$ (the average character size of
the whole document image) according to the same steps as in the
embodiment 1, performs the extracting judgment of plural levels according
to whether the following equation for the extracting judgment on the
plural levels is established or not.
[Equation 4]

$$\mathrm{SizeReg}_n \geq \mathrm{SizeAll} \times \alpha_p$$

$\alpha_p$ in the above equation is the extracting parameter on the p-th
level (the level p), and the value of $\alpha_p$ should be predetermined
so as to satisfy the condition of the Equation 5. When the extracting
judgment is performed on 5 levels, it is preferable that the parameters
be set to approximately $\alpha_1 = 1.5$, $\alpha_2 = 1.3$,
$\alpha_3 = 1.2$, $\alpha_4 = 1.15$, and $\alpha_5 = 1.1$.

[Equation 5]

$$1.0 < \alpha_P < \cdots < \alpha_3 < \alpha_2 < \alpha_1$$

The flowchart shown in Fig. 3 is explained here. The title-region
extracting means 104 performs the extracting judgment on every level in
order from the level 1 (Fig. 3, Steps 14 to 15 to 14). When the
extracting judgment equation is not established on any level, the region
is not extracted as the title region, and the title-region extracting
means 104 performs the extracting judgment for the next region (Fig. 3,
Steps 14 to 13 to 14 to 15). On the other hand, when the extracting
judgment equation is established on one level, the region is extracted as
the title region on that level (which is correlated with the level
attribute), and then the extracting judgment is performed for the next
region (Fig. 3, Steps 15 to 16 to 13 to 14 to 15).

After the extracting judgment is performed for every region by
repeating the above steps (Fig. 3, "NO" at Step 13), the title region
extracting is completed.
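
The multi-level judgment of the Equation 4 can be illustrated by a minimal
Python sketch. It is not the implementation of the title-region extracting
means 104; the function names `level_for_region` and
`extract_title_regions` and the sample sizes are hypothetical, and it is
assumed that the first averages and the second average have already been
calculated.

```python
from typing import Dict, Optional, Sequence


def level_for_region(size_reg: float, size_all: float,
                     alphas: Sequence[float]) -> Optional[int]:
    """Return the first level p (1-based) on which Equation 4 holds,
    or None when it holds on no level."""
    for p, alpha in enumerate(alphas, start=1):
        if size_reg >= size_all * alpha:
            return p
    return None


def extract_title_regions(first_averages: Sequence[float], size_all: float,
                          alphas: Sequence[float]) -> Dict[int, int]:
    """Map region index -> level attribute for regions judged as titles."""
    levels = {}
    for n, size_reg in enumerate(first_averages):
        p = level_for_region(size_reg, size_all, alphas)
        if p is not None:
            levels[n] = p
    return levels


# Parameters of the 5-level example; the region sizes are made up.
alphas = [1.5, 1.3, 1.2, 1.15, 1.1]
result = extract_title_regions([24.0, 13.5, 11.8, 10.0], 10.0, alphas)
print(result)  # {0: 1, 1: 2, 2: 4}
```

Because the levels are tested in order from the level 1, each region
receives the level of the largest criterion it satisfies, and a region
that satisfies no criterion is left out of the result.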

Besides, when there is no region where the extracting judgment
equation is established, the character string inputted by a user is used
as the title character string, which is the same as the embodiment 1.
The level attribute of this title character string is set as the level 1
and the number of total levels is set as 1.

In addition, the extracted title character string can be changed or
corrected, which is the same as the embodiment 1.

Fig. 6 is an explanatory diagram of the registration information
management table 108c in this embodiment. It is arranged that the "level
attribute" field 601 and the "number of total levels" field 602 be added
to the configuration described in the embodiment 1. When there is any
region extracted on the level 1 of the 5 levels of the extracting
judgments, the document registering means 106 registers "5" on the
"number of total levels" field 602 and "1" on the "level attribute" field
601 respectively.

Fig. 11 is a diagram showing the contents displayed on the screen at
the searching in this embodiment. It is arranged that the range of the
level attribute of the titles to be displayed in a list view on the upper
portion can be selected by the instruction inputting means 109. The
display control means 110 displays on the screen in a list view the
titles having the level attribute within the range selected as above,
referring to the "level attribute" field 601 and the "number of total
levels" field 602 in the registration information management table 108c.

In accordance with this embodiment described above, it is arranged
that the extracting criteria of plural levels be calculated by using the
extracting parameters of plural levels and the title regions be extracted
corresponding to each level attribute. Therefore it is possible to
perform various processing by discriminating between the first averages,
for example, displaying in a list view the title character strings only,
without displaying in a list view the subtitle character strings.
(EMBODIMENT 3) 

It is arranged in the embodiment 2 that the extracting parameters of
plural levels be predetermined (as fixed values); however, it is
preferable that the extracting parameters be determined according to the
respective characteristics of the inputted document images. It is
arranged in this embodiment that the extracting parameters of plural
levels be determined based on a value found by dividing the maximum value
of the first averages by the second average (see Fig. 4, Step 23). The
configuration will be explained regarding only the points different from
that of the embodiment 2.

After calculating the first average $\mathrm{SizeReg}_n$ and the
second average $\mathrm{SizeAll}$ according to the same steps as in the
embodiment 2, the title-region extracting means 104 first calculates the
value $\alpha_1$, that is, the maximum value of the first averages,
$\max\{\mathrm{SizeReg}_n\}$, divided by the second average
$\mathrm{SizeAll}$, according to the following equation.
[Equation 6]

$$\alpha_1 = \max\{\mathrm{SizeReg}_n\} / \mathrm{SizeAll}$$

Next, by using the following equation, the title-region extracting
means 104 determines the extracting parameter $\alpha_p$ on each level in
accordance with thus calculated $\alpha_1$ and the number of total levels
$P$ ($P \geq 1$) for the extracting judgment.
[Equation 7]

$$\alpha_p = \alpha_1 - (p - 1) \times (\alpha_1 - 1) / P$$

For instance, where the extracting judgment is performed on 5 levels
when $\alpha_1$ is 1.5, the extracting parameters $\alpha_1$ to
$\alpha_5$ on each level are calculated as follows.
[Equation 8]

$$\alpha_1 = 1.5 - (1 - 1) \times (1.5 - 1) / 5 = 1.5$$

$$\alpha_2 = 1.5 - (2 - 1) \times (1.5 - 1) / 5 = 1.4$$

$$\alpha_3 = 1.5 - (3 - 1) \times (1.5 - 1) / 5 = 1.3$$

$$\alpha_4 = 1.5 - (4 - 1) \times (1.5 - 1) / 5 = 1.2$$

$$\alpha_5 = 1.5 - (5 - 1) \times (1.5 - 1) / 5 = 1.1$$

According to the Equation 7, the extracting parameter $\alpha_p$ on
each level can be determined so as to be equally spaced between thus
calculated $\alpha_1$ and 1.0.
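
As a check on the Equations 6 to 8, the following short Python sketch
computes the extracting parameters from the maximum first average and the
second average. The function name `extracting_parameters` and its argument
names are hypothetical and are used only for illustration.

```python
def extracting_parameters(max_first_average, second_average, total_levels):
    """Return [alpha_1, ..., alpha_P] according to Equations 6 and 7."""
    alpha_1 = max_first_average / second_average           # Equation 6
    step = (alpha_1 - 1.0) / total_levels
    return [alpha_1 - (p - 1) * step                       # Equation 7
            for p in range(1, total_levels + 1)]


# Reproduces the Equation 8: alpha_1 = 15 / 10 = 1.5 on 5 levels.
params = extracting_parameters(15.0, 10.0, 5)
print([round(a, 2) for a in params])  # [1.5, 1.4, 1.3, 1.2, 1.1]
```

The values are rounded only for display, since the subtraction is done in
floating point.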

The steps after the above steps are the same as those of the
embodiment 2, except that the extracting judgment uses the extracting
parameters determined as described above, and those steps are not
explained here.

In the above method, however, where there is no title region in the
document image, $\alpha_1$ becomes a value near to 1.0, for example,
1.03, and a text region is extracted as a title region by mistake. In
order to avoid such trouble, this invention is arranged so as not to use
the extracting parameter under a specific value, for example, 1.5.

In addition, when the difference between the extracting parameters
on each level is not more than a specific value, 0.03, for example, the
extracting judgment cannot be performed precisely. Accordingly, it is
arranged that the set value of the extracting parameter be corrected so
that the difference between the extracting parameters on each level may
be said specific value (0.03). Practically, in the above case, the values
found by subtracting 0.03 successively, in order from $\alpha_1$ to
$\alpha_5$, are determined as the extracting parameters.

As a result of the above, there is a possibility that the number of
total levels $P$ reduces. In this case, the actual number of levels
(which is the number of total levels $P$ minus the number of levels
removed) is registered as the number of total levels $P$ on the "number
of total levels" field 602 in the registration information management
table 108c.
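
The correction of the spacing can be sketched in Python as follows. This
is only one possible reading of the passage: the function name
`correct_parameters` is hypothetical, and it is an assumption of the
sketch (based on the condition of the Equation 5 that every parameter
exceeds 1.0) that the number of levels reduces because the respaced
parameters that do not exceed 1.0 are dropped.

```python
def correct_parameters(alphas, min_gap=0.03, floor=1.0):
    """Respace equally spaced parameters whose gap is not more than
    min_gap, starting from alpha_1, and drop parameters <= floor."""
    if len(alphas) < 2:
        return list(alphas)
    gap = alphas[0] - alphas[1]          # spacing of the original parameters
    if gap > min_gap:
        return list(alphas)              # spacing is already wide enough
    respaced = [alphas[0] - p * min_gap for p in range(len(alphas))]
    return [a for a in respaced if a > floor]   # assumed rule for reducing P


# alpha_1 = 1.1 on 5 levels gives a gap of 0.02, which is too narrow.
narrow = [1.1, 1.08, 1.06, 1.04, 1.02]
corrected = correct_parameters(narrow)
print([round(a, 2) for a in corrected])  # [1.1, 1.07, 1.04, 1.01]
print(len(corrected))                    # 4, the value to register as P
```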

In this embodiment as described above, it is arranged that the
extracting parameters be not fixed but determined based on the
characteristics of the inputted document image. Therefore it is possible
to perform the extracting judgment precisely.
(EMBODIMENT 4) 

In each of the above-mentioned embodiments, the characters of the
title region in the relatively large size are also taken into the
calculation of the second average, and the small characters such as a
comma, a period, and other punctuation are also taken into that
calculation. The calculation result thereby tends to lower the accuracy
of the title extracting. Therefore it is arranged in this embodiment that
the second average be calculated based on the total characters in the
document image excluding the characters of which sizes are larger than a
specific ratio (90%, for example) and the characters of which sizes are
smaller than a specific ratio (10%, for example). That is to say, the
trimmed mean method is adopted here. In addition, the same trouble also
occurs when the first average is calculated. Therefore, it is possible to
apply the trimmed mean method to the calculation of the first average.

As a result, it is possible to calculate the second average and the
first average regarding the characters excluding a period, a comma and
other punctuation, so that the calculated averages become more precise
than ever.
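
The trimmed mean method can be illustrated by the following Python sketch.
It assumes that the "specific ratios" are applied to the characters ranked
by size, so that the smallest 10% and the largest 10% of the characters
are left out; the function name `trimmed_mean` and the sample sizes are
hypothetical.

```python
def trimmed_mean(sizes, lower=0.10, upper=0.90):
    """Average the character sizes after dropping the smallest and the
    largest ones, ranked by size (rank-based trimming is an assumption)."""
    if not sizes:
        raise ValueError("at least one character size is required")
    ordered = sorted(sizes)
    n = len(ordered)
    lo = int(n * lower)                  # number of smallest sizes dropped
    hi = int(n * upper)                  # index one past the last size kept
    kept = ordered[lo:hi] or ordered     # keep everything if nothing is left
    return sum(kept) / len(kept)


# One period (size 2), eight body characters (size 10), one title
# character (size 30).
sizes = [2, 10, 10, 10, 10, 10, 10, 10, 10, 30]
print(sum(sizes) / len(sizes))  # 11.2, the plain mean
print(trimmed_mean(sizes))      # 10.0, the trimmed mean
```

In this example the plain mean is pulled upward by the single title
character, while the trimmed mean stays at the size of the body text.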

Here, in each of the aforementioned embodiments, the second average
is calculated from the first averages. But, when this method is combined
with the trimmed mean method, the characters in large size and the
characters in small size are excluded within each region. For this
reason, the second average should not be calculated from the first
averages, because the characters would then be excluded region by region.
Therefore, it is arranged in this embodiment that the second average be
calculated once again for the total characters in the document image.

However, even in case of using the trimmed mean method, it is
needless to say that either one of the specific value in the embodiment 1
and the level values in the embodiments 2 and 3 can be used as the
extracting parameter.

The explanation of each of the above embodiments does not refer to
the number of the original documents, but the number of the originals is
not limited particularly. That is to say, whether there is only one sheet
or plural sheets, it is possible to obtain the same effect as far as the
same extracting parameter is used in each page. In particular, it is
possible in the embodiments 2 and 3 to extract accurately the title
regions and the sub-title regions from one document composed of plural
pages, such as the data of a thesis, by applying the same extracting
parameter to the plural pages.

According to the above explanations, the height of the character 
rectangle is adopted as the character size, but the width or the area of 
the character rectangle can be adopted as the character size. 

As illustrated in Fig. 1, since the image processing means 111a and
111b are provided at the stage before the storage means 108 and the image
memory 107 respectively, it is possible to use a binary image as a
document image for the title extracting as well as use a compressed image
or a multi-valued image as the document image data to be stored in the
document image area Aa of the storage means 108. Thereby, it is possible
to display in various ways the document images obtained as a result of
the searching based on the title regions thus extracted, like displaying
in color.
(EMBODIMENT 5) 

The embodiments 5 and 6 explain the document image processor that
recognizes the marks attached on the paper document as the document tag
information.

First, marks such as a title or a keyword are given by a user to any
page composing the paper document. Here the mark indicates a general mark
given by a user so as to identify the paper document, like a stamp, a
seal, an illustration, a signature of specific handwriting, and so on.

When the document image processor in the invention stores the paper
document composed of a plurality of pages, it is necessary to judge which
page of the paper document the marks are attached to. Although there is a
method for detecting the marks by searching all the pages of the paper
document, the method has a problem that the detecting takes much time.

One method to solve said problem is as follows; for instance, the
document image processor is configured in advance so as to detect a mark
on the first page only.

In this embodiment of the invention, the marked pages (each of which
is called a "document tag information appointing page" hereafter) 21 and
24 can be distinguished from others by describing the specific
two-dimensional code image 26 at the specific position of the lower
right, as shown in Fig. 13(b).

Fig. 12 shows a block diagram of the document image processor in the
embodiment 5 of the invention. The steps of the processing by the
document image processor will be explained hereafter.

First, the image inputting means 1201 converts the paper document to
electronic data by using a photoelectric converter such as a scanner, a
digital integrated apparatus, and so on, and then inputs the document as
the document images. Here, as shown in Fig. 13, the document tag
information attached to the document tag information appointing page 21,
"Confidential", "A-company", and "Year '99", should be given to the
inputted images 22 and 23, and the document tag information attached to
the document tag information appointing page 24, "Confidential" and
"B-company", should be given to the inputted image 25. The document tag
information appointing page 21, the input image 22, the input image 23,
the document tag information appointing page 24 and the input image 25
are inputted to the image inputting means 1201 in that order.

The inputted document image is stored in the image memory 1202
temporarily, and the image data compressing means 1203 performs the data
compressing on it. After that, said data is stored in the image storage
area 1211 of the storage means 1210. At this time, in order to identify
each document image thus stored, an image ID is given to each document
image. The image ID is registered in the "image ID" field 121 of the
registration image management table 1212 shown in Fig. 13(a). In
addition, the pointer information pointing to the image data stored in
the image storage area 1211 of the storage means 1210 is registered in
the "pointer to image data" field 122 of the registration image
management table 1212.

On the other hand, the document image stored in the image memory
1202 is also sent to the mark extracting means 1205 after the
binarization by the binarization means 1204. The mark extracting means
1205 judges whether or not the specific two-dimensional code image is at
the predetermined position of the lower right of the image, and thereby
determines whether or not each inputted document image is the document
tag information appointing page.

At this time, if the document image is determined as the document
tag information appointing page, "1" is registered on the "document tag
information appointing page flag" field 123 of the registration image
management table 1212; if not, "0" is registered on it. The flag is used
to identify that the document image is a document tag information
appointing page, which carries the marks only and does not contain the
text content of the paper document. For instance, after the document tag
information is given to the document images according to the method
described later, the document image corresponding to the document tag
information appointing page can be deleted according to the flag. Thereby
it is possible to avoid a waste of the memory resources.

A mark management group No. is given to all the document images
inputted between the first document tag information appointing page and
the next one. In addition, the mark management group No. is registered on
the "mark management group No." field 125 of the registration image
management table 1212. This means that the document images to which the
same document tag information is given are imparted the same mark
management group No.
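
The grouping described above can be sketched in Python as follows. The
function name `assign_group_numbers` is hypothetical. It is an assumption
of the sketch that the numbering starts at 1, that each document tag
information appointing page receives the same number as the pages that
follow it, and that pages inputted before the first appointing page
receive no number.

```python
def assign_group_numbers(appointing_flags):
    """Return one mark management group No. per page, in input order.
    appointing_flags holds the value of field 123 (1 or 0) for each page."""
    groups = []
    current = None                       # no group before the first flag
    for is_appointing in appointing_flags:
        if is_appointing:
            current = 1 if current is None else current + 1
        groups.append(current)
    return groups


# Input order of Fig. 13: pages 21, 22, 23, 24, 25.
flags = [1, 0, 0, 1, 0]
print(assign_group_numbers(flags))  # [1, 1, 1, 2, 2]
```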

The next explanation refers to the processing in which the mark
extracting means 1205 extracts the marks from the document image
determined as the document tag information appointing page according to
the above processing.

First, as described in the embodiment 1, it labels all the regions of
the document tag information appointing page excluding the region
attached with the two-dimensional code. Among a plurality of the
contiguous components of black pixels obtained by the labeling, the
components whose distance from each other is smaller than the specific
threshold value are combined into one region. The regions thus obtained
correspond to the regions of marks 41 to 43 respectively, as shown in
Fig. 16. Those regions are extracted, and thereby each mark image can be
obtained.
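
The combining of nearby components can be illustrated by the following
Python sketch. It assumes that each labeled component is already
represented by its bounding box `(x0, y0, x1, y1)` and that the distance
between two components is the gap between their bounding boxes; both are
assumptions of the sketch, and the names `box_gap` and `merge_components`
are hypothetical.

```python
def box_gap(a, b):
    """Gap between two bounding boxes (0 when they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def merge_components(boxes, threshold):
    """Combine boxes closer than threshold into regions (union-find)."""
    parent = list(range(len(boxes)))

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]    # path halving
            k = parent[k]
        return k

    for a in range(len(boxes)):
        for b in range(a + 1, len(boxes)):
            if box_gap(boxes[a], boxes[b]) < threshold:
                parent[find(a)] = find(b)

    regions = {}
    for k, (x0, y0, x1, y1) in enumerate(boxes):
        root = find(k)
        if root in regions:
            r = regions[root]
            regions[root] = (min(r[0], x0), min(r[1], y0),
                             max(r[2], x1), max(r[3], y1))
        else:
            regions[root] = (x0, y0, x1, y1)
    return sorted(regions.values())


# Two strokes of one stamp, 2 pixels apart, and a separate mark far away.
boxes = [(0, 0, 10, 10), (12, 0, 20, 10), (100, 100, 120, 120)]
print(merge_components(boxes, threshold=5))
# [(0, 0, 20, 10), (100, 100, 120, 120)]
```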

The number of marks extracted from each document tag 
information appointing page is registered on the "number of marks" field 
124 of the registration image management table 1212. 

In addition, in order to manage the information of the extracted
mark images, each mark image is given a mark ID, which is then registered
on the "mark ID" field 131 of the mark management table 1213 as shown in
Fig. 14. The mark management group No. of the document tag information
appointing page attached with the mark is registered on the "mark
management group No." field 132 of the mark management table 1213.
Regarding each of the mark images extracted from the document tag
information appointing page, the information about the position and the
size (the width and the height) of the mark image within the document tag
information appointing page is registered on the "position" field 134 and
the "size" field 135 of the mark management table 1213 respectively.

In this embodiment, the document images inputted between the first
document tag information appointing page and the next one are given the
same mark management group No., and managed as a series of document
images attendant on the first document tag information appointing page.
As another management method, it can be considered that only the specific
document image inputted after the document tag information appointing
page is given the mark management group No. and the other document images
are not given any number. This method can be applied when a user wants to
give the table of contents to the specific document image.

Next, the characteristics value calculating means 1206 of the
calculating means 120A calculates the numerical values representing the
characteristics of the mark image extracted by the mark extracting means
1205. The invention uses, as these numerical values, the characteristics
values of the Moment Invariants, which are well-known prior art. The
Moment Invariants are explained in brief as follows.

When the coordinates of a pixel are represented by $(i, j)$ and the
value of the pixel is represented by $I(i, j)$, $I$ is a function that
satisfies $I = 1$ for the black pixel and $I = 0$ for the white pixel.
The $m_{pq}$ defined by the Equation 9 is called the moment of order
$(p+q)$.
[Equation 9]

$$m_{pq} = \sum_i \sum_j i^p j^q I(i, j), \qquad p, q = 0, 1, 2, \ldots$$

In case of applying the above $m_{pq}$, the center of gravity
$(x, y)$ of the two-dimensional image is represented by the Equation 10.
[Equation 10]

$$x = m_{10} / m_{00}$$

$$y = m_{01} / m_{00}$$

$\mu_{pq}$ defined by the Equation 11 according to the center of
gravity thus calculated is called the central moment.
[Equation 11]

$$\mu_{pq} = \sum_i \sum_j (i - x)^p (j - y)^q I(i, j)$$

The numerical values M1 to M6 calculated as follows by the Equation
12 according to the above central moments are defined as the
characteristics values of the corresponding two-dimensional image (that
is, the Moment Invariants).
[Equation 12]

$$M_1 = \mu_{20} + \mu_{02}$$

$$M_2 = (\mu_{20} - \mu_{02})^2 + 4\mu_{11}^2$$

$$M_3 = (\mu_{30} - 3\mu_{12})^2 + (3\mu_{21} - \mu_{03})^2$$

$$M_4 = (\mu_{30} + \mu_{12})^2 + (\mu_{21} + \mu_{03})^2$$

$$\begin{aligned}
M_5 ={}& (\mu_{30} - 3\mu_{12})(\mu_{30} + \mu_{12})\left[(\mu_{30} + \mu_{12})^2 - 3(\mu_{21} + \mu_{03})^2\right] \\
&+ (3\mu_{21} - \mu_{03})(\mu_{21} + \mu_{03})\left[3(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2\right]
\end{aligned}$$

$$\begin{aligned}
M_6 ={}& (\mu_{20} - \mu_{02})\left[(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2\right] \\
&+ 4\mu_{11}(\mu_{30} + \mu_{12})(\mu_{21} + \mu_{03})
\end{aligned}$$

Since those characteristics values are unchanged by the rotation or
the translation of the two-dimensional image, they are effective values
for characterizing the mark in a case like this embodiment of the
invention, where a user gives a specific mark to a sheet by hand.
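
The Equations 9 to 12 can be checked with the following Python sketch,
which computes M1 to M6 for a small binary image given as a list of rows
(1 for a black pixel, 0 for a white pixel). The function names
`raw_moment` and `moment_invariants` and the sample image are
hypothetical; the sketch is unoptimized and only illustrates the formulas.

```python
def raw_moment(img, p, q):
    """m_pq of Equation 9 for a binary image given as a list of rows."""
    return sum((i ** p) * (j ** q) * v
               for i, row in enumerate(img)
               for j, v in enumerate(row))


def moment_invariants(img):
    """Return (M1, ..., M6) of Equation 12."""
    m00 = raw_moment(img, 0, 0)
    x = raw_moment(img, 1, 0) / m00              # Equation 10
    y = raw_moment(img, 0, 1) / m00

    def mu(p, q):                                # Equation 11
        return sum(((i - x) ** p) * ((j - y) ** q) * v
                   for i, row in enumerate(img)
                   for j, v in enumerate(row))

    u20, u02, u11 = mu(2, 0), mu(0, 2), mu(1, 1)
    u30, u03, u21, u12 = mu(3, 0), mu(0, 3), mu(2, 1), mu(1, 2)
    a, b = u30 + u12, u21 + u03
    m1 = u20 + u02
    m2 = (u20 - u02) ** 2 + 4 * u11 ** 2
    m3 = (u30 - 3 * u12) ** 2 + (3 * u21 - u03) ** 2
    m4 = a ** 2 + b ** 2
    m5 = ((u30 - 3 * u12) * a * (a ** 2 - 3 * b ** 2)
          + (3 * u21 - u03) * b * (3 * a ** 2 - b ** 2))
    m6 = (u20 - u02) * (a ** 2 - b ** 2) + 4 * u11 * a * b
    return (m1, m2, m3, m4, m5, m6)


# An L-shaped mark, the same mark rotated by 90 degrees, and shifted.
mark = [[1, 0, 0],
        [1, 0, 0],
        [1, 1, 0]]
rotated = [list(r) for r in zip(*mark[::-1])]
shifted = [[0] * 5] + [[0, 0] + row for row in mark]
print(moment_invariants(mark))
print(moment_invariants(rotated))
print(moment_invariants(shifted))
```

The three printed tuples agree up to floating-point rounding, which
illustrates the invariance under rotation and translation stated above.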

The characteristics values calculated by the characteristics value
calculating means 1206 are given to the similarity calculating means 1207
of the calculating means 120A, where the similarity between these
characteristics values and the attribute values of the respective
reference tag information is calculated. In order to explain this method,
the management method of the reference tag information and the
calculating method of the attribute values of the respective reference
tag information are explained first.

The reference tag information is the data correlated with the mark
that a user will use in the future (which is called the "reference mark"
hereafter); specifically, it is a candidate for the document tag
information, like the character string playing a role as a keyword for
the inputted image. The reference tag information is registered on the
"reference tag information" field 141 of the reference tag information
management table 1214 as shown in Fig. 15(a). The image data of the
reference mark is stored in the reference tag information storage means
1215. The pointer to this image data is registered on the "pointer to
reference mark image" field 142 of the reference tag information
management table 1214. The characteristics value calculating means 1206
calculates six characteristics values of those reference marks on the
Moment Invariants, which are registered on the "attribute values (M1 to
M6)" field 143 of the reference tag information management table 1214.
That is to say, those characteristics values are the attribute values of
the respective reference marks.

The distance between the attribute values of the reference mark thus
calculated and the respective characteristics values on the Moment
Invariants of the mark image extracted from the inputted image is
calculated by the Equation 13.
[Equation 13]

$$\begin{aligned}
L ={}& (m_1 - M_1)^2 + (m_2 - M_2)^2 + (m_3 - M_3)^2 + (m_4 - M_4)^2 \\
&+ (m_5 - M_5)^2 + (m_6 - M_6)^2
\end{aligned}$$

The above $M_1$ to $M_6$ represent the attribute values of the
reference mark, while the above $m_1$ to $m_6$ represent the
characteristics values of the extracted mark image. The smaller the
distance $L$ calculated by the above equation, the higher the similarity
between the extracted mark image and the reference tag information.
The document tag information imparting means 1208 specifies the
reference mark of which similarity is the maximum, selects the reference
tag information of that reference mark as the document tag information of
the inputted document image, and then imparts the information to the
document image. In addition, the document tag information is registered
on the "document tag information" field 133 of the mark management table
1213.

By applying the above-mentioned processing, it is possible to
automatically impart the document tag information to each inputted
document image. By using the information of each table obtained here, the
searching of the image can be performed according to the following
procedure.

First, when a user selects one piece of document tag information to
be used for the searching, the mark management group Nos. corresponding
to the document tag information can be specified from the mark management
table 1213. Additionally, the image IDs of the document images given the
mark management group No. and the pointer information to the document
image data can be specified from the registration image management table
1212. The document images specified here are the images correlated with
the document tag information designated by the user. By designating a
plurality of document tag information, it is possible to narrow down the
image data to be searched.

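The search over the two tables can be sketched in Python as follows. The
dictionaries stand in for rows of the mark management table 1213 and the
registration image management table 1212; the key names, the function name
`search_images`, and the use of the reference numerals of Fig. 13 as image
IDs are hypothetical. Leaving the document tag information appointing
pages out of the result is also an assumption of the sketch.

```python
def search_images(tags, mark_table, image_table):
    """Return the image IDs whose group carries every designated tag."""
    wanted = set(tags)
    tags_by_group = {}
    for row in mark_table:
        tags_by_group.setdefault(row["group_no"], set()).add(row["tag"])
    groups = {g for g, t in tags_by_group.items() if wanted <= t}
    return [row["image_id"] for row in image_table
            if row["group_no"] in groups and not row["appointing"]]


mark_table = [
    {"mark_id": 1, "group_no": 1, "tag": "Confidential"},
    {"mark_id": 2, "group_no": 1, "tag": "A-company"},
    {"mark_id": 3, "group_no": 1, "tag": "Year '99"},
    {"mark_id": 4, "group_no": 2, "tag": "Confidential"},
    {"mark_id": 5, "group_no": 2, "tag": "B-company"},
]
image_table = [
    {"image_id": 21, "group_no": 1, "appointing": 1},
    {"image_id": 22, "group_no": 1, "appointing": 0},
    {"image_id": 23, "group_no": 1, "appointing": 0},
    {"image_id": 24, "group_no": 2, "appointing": 1},
    {"image_id": 25, "group_no": 2, "appointing": 0},
]
print(search_images(["Confidential"], mark_table, image_table))
# [22, 23, 25]
print(search_images(["Confidential", "A-company"], mark_table, image_table))
# [22, 23]
```

Designating the second tag narrows the result from three images to two,
as described above.
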
When the document tag information with the maximum similarity
calculated by the similarity calculating means 1207 has the distance L
from the extracted mark image that is larger than the predetermined
threshold value, it is judged that there is no registered document tag
information to be correlated with this mark image and that a new
reference mark has been inputted. In this case, the mark image is
displayed according to the information of the "position" field 134 and
the "size" field 135 of the mark management table 1213 and the "pointer
to image data" field 122 of the registration image management table 1212,
and then the user is asked to register the document tag information to be
correlated with the new reference mark.
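
The selection by the Equation 13 together with the threshold judgment can
be illustrated by the following Python sketch. The function names
`distance` and `match_reference`, the dictionary of reference marks, and
the numerical values are hypothetical; real attribute values would be the
Moment Invariants M1 to M6 of each reference mark.

```python
def distance(features, attributes):
    """L of Equation 13: sum of squared differences of m1..m6 and M1..M6."""
    return sum((m - M) ** 2 for m, M in zip(features, attributes))


def match_reference(features, references, threshold):
    """Return (tag, L) for the nearest reference mark, or (None, L) when
    even the nearest one is farther than the threshold."""
    best_tag, best_l = None, float("inf")
    for tag, attributes in references.items():
        l = distance(features, attributes)
        if l < best_l:
            best_tag, best_l = tag, l
    if best_l > threshold:
        return None, best_l      # treated as a new reference mark
    return best_tag, best_l


references = {
    "Confidential": (1.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    "A-company": (5.0, 0.0, 0.0, 0.0, 0.0, 0.0),
}
print(match_reference((1.5, 0, 0, 0, 0, 0), references, threshold=1.0))
# ('Confidential', 0.25)
print(match_reference((10.0, 0, 0, 0, 0, 0), references, threshold=1.0))
# (None, 25.0)
```

In the second call no registered mark is close enough, which corresponds
to the case where the user is asked to register new document tag
information.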

The document tag information inputted here is newly registered on
the "reference tag information" field 141 of the reference tag
information management table 1214. The image data of the newly inputted
reference mark is stored in the reference tag information storage means
1215 in order to apply it to the succeeding searching, while the pointer
information to the mark image data is registered on the "pointer to
reference mark image" field 142 of the reference tag information
management table 1214. In addition, the characteristics values of the new
reference mark on the Moment Invariants are calculated and then
registered on the "attribute values (M1 to M6)" field 143 of the
reference tag information management table 1214.

As described above, a user inputs the new mark image and the
document tag information; thereby the new reference tag information can
be registered.

Besides, in the above explanation, the reference tag information
correlated with the reference mark is the character string applied to the
reference mark as shown in Fig. 14 and Fig. 15(a), but the reference tag
information need not always be the character string. That is to say, each
reference mark can be correlated with any reference tag information in
the reference tag information management table 1214.

For example, instead of the above-mentioned character string, the
thumbnail image of each reference mark is correlated with the reference
mark as the reference tag information. The thumbnail image is printed on
a searching sheet. By reading the thumbnail image of the searching sheet
with a scanner, the desired document image can be searched.

In order to specify the document tag information appointing page
among all the inputted images, the two-dimensional code is applied as
shown in the explanatory diagrams in Fig. 13 and Fig. 16. However, a
one-dimensional code and so on can be used, too. There are other methods
for specifying the document tag information appointing page than the
above, that is, a method in which a specific mark is used instead of the
two-dimensional code image, a method in which a sheet of a specific color
is used, or a method in which a sheet of a specific shape or a specific
size is used. It is possible to obtain the same effect by those methods.

In addition, when the same document tag information is imparted to
all the inputted images, it is possible to arrange the document image
processor by defining that only the image inputted as the first sheet is
the image of the document tag information appointing page. In this case,
since it is already known that the image inputted as the first sheet is
the document tag information appointing page, the processing for
specifying the document tag information appointing page by the
two-dimensional code image and so on is not needed. Therefore, it is
possible to simplify the processing as a whole.

It is needless to say that it is possible to extract the mark by
searching all the pages of the paper document without using the
two-dimensional code. In that case, characters included in the paper
document, such as "Confidential", may be extracted as a mark in addition
to the mark attached by a user. The characters may then be added to the
mark management table 1213 as one of the marks.

The correlating between the mark image and the reference tag
information is performed by using the characteristics values on the
Moment Invariants in the above explanation; however, it is possible to
obtain the same effect by template matching, which compares the rate of
the black pixels matched when two images are overlapped.
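
A template matching score of this kind can be sketched in Python as
follows. The exact definition of the rate is not given in the text; the
sketch assumes two binary images of the same size that are already
aligned, and defines the rate as the number of pixels black in both images
divided by the number of pixels black in either image. The function name
`match_rate` is hypothetical.

```python
def match_rate(a, b):
    """Rate of matched black pixels when image a is overlapped on image b
    (assumed definition: black in both / black in either)."""
    both = either = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                both += 1
            if pa or pb:
                either += 1
    return both / either if either else 1.0


mark = [[1, 1],
        [0, 0]]
template = [[1, 0],
            [0, 0]]
print(match_rate(mark, mark))      # 1.0
print(match_rate(mark, template))  # 0.5
```

Unlike the Moment Invariants, this score changes when the mark is rotated
or shifted, so the images would have to be aligned before the comparison.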

Besides, it is possible to correlate a plurality of reference marks with
one piece of reference tag information. This is done as follows: the same
reference tag information is registered a plurality of times in the reference
tag information management table 1214, and each entry is correlated with a
different reference mark. In this case, when paper documents bearing
different marks are inputted, the same document tag information is attached
to the resulting document images.

Conversely, one reference mark can be correlated with a plurality of pieces
of reference tag information. This is done by correlating different pieces of
reference tag information in the reference tag information management table
1214 with the same reference mark. In this case, when a paper document
bearing one mark is inputted, a plurality of pieces of document tag
information are attached to the resulting document image.
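The two kinds of correlation described above (several reference marks sharing one piece of tag information, and one reference mark carrying several pieces) can be illustrated with a small lookup structure. The sketch below is only an assumption about how rows of the reference tag information management table might be represented; the mark identifiers, the row layout and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (reference mark ID, reference tag information).
REFERENCE_TAG_ROWS = [
    ("mark_star", "Confidential"),
    ("mark_seal", "Confidential"),   # two different marks, same tag information
    ("mark_circle", "A-company"),
    ("mark_circle", "Year 1999"),    # one mark, two pieces of tag information
]


def build_lookup(rows):
    """Group the table rows by reference mark, preserving registration order."""
    lookup = defaultdict(list)
    for mark_id, tag in rows:
        if tag not in lookup[mark_id]:
            lookup[mark_id].append(tag)
    return dict(lookup)


def tags_for_marks(mark_ids, lookup):
    """Collect the tag information for every mark found on one document image."""
    tags = []
    for mark_id in mark_ids:
        for tag in lookup.get(mark_id, []):
            if tag not in tags:
                tags.append(tag)
    return tags
```

With this table, documents stamped with either `mark_star` or `mark_seal` receive "Confidential", while a document stamped only with `mark_circle` receives both "A-company" and "Year 1999".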
(EMBODIMENT 6) 

This embodiment describes a method for imparting the document tag
information to the document image by extracting a mark stamped in a blank
part of the paper document to be registered. The following describes the
points that differ from the embodiment 5, with reference to Fig. 12.

The image inputting means 1201 obtains document images by electronically
converting a paper document inputted by a user, as in the embodiment 5. As
shown in Fig. 17(b), the document tag information "Confidential",
"A-company", and "year 99" is attached to the document images 31 and 32,
while the document tag information "Confidential" and "B-company" is attached
to the document image 33. To this end, the blank part of each image is
stamped with the marks correlated with the document tag information to be
attached.

The image data obtained here is stored temporarily in the image memory
1202. After the data is compressed by the image data compressing means 1203,
it is stored in the image storage area 1211 of the storage means 1210. As
information about the stored image data, the necessary information is
registered in the "image ID" field 121' and the "pointer to image data" field
122' of the registration image management table 1212' as shown in Fig. 17(a),
in the same way as in the embodiment 5.

The image in the image memory 1202 is binarized by the image binarization
means 1204 and then sent to the mark extracting means 1205'. In order to
extract the region of the mark image precisely, this embodiment uses a framed
mark as shown in Fig. 17(b), and the mark extracting means 1205' extracts
each mark as follows.

Each black pixel of the binary image is labeled, and the size of the
circumscribing rectangle is calculated for each black-pixel-connected
component. The circumscribing rectangle of the black-pixel-connected
component corresponding to the frame portion of the mark is sufficiently
large compared with the size of each character included in the inputted
image, but does not become extremely large because the mark is stamped within
the blank part of the document. Using this characteristic, only the regions
whose circumscribing rectangle has a size between two specified threshold
values are extracted from the black-pixel-connected components obtained by
the labeling. That is to say, the region of each mark image can be extracted
by extracting the regions of black-pixel-connected components whose height
and width are each larger than one threshold value (regarded as the minimum
size, in height and width, of the blank) and smaller than another threshold
value (regarded as the maximum size of the blank).
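The labeling and size filtering described above might be sketched as follows. This is an illustrative sketch under stated assumptions rather than the mark extracting means itself: the binary image is a list of rows with 1 for a black pixel, connectivity is taken to be 8-neighbor (the specification does not say which connectivity is used), and the function names and threshold parameters are hypothetical.

```python
from collections import deque


def connected_component_boxes(image):
    """Label 8-connected black pixels and return each component's
    circumscribing rectangle as (top, left, height, width)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first search over one connected component.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, bottom, left, right = r, r, c, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom - top + 1, right - left + 1))
    return boxes


def extract_mark_regions(image, min_size, max_size):
    """Keep only components whose height and width both lie strictly
    between the two threshold values."""
    return [
        box for box in connected_component_boxes(image)
        if min_size < box[2] < max_size and min_size < box[3] < max_size
    ]
```

Choosing `min_size` above the largest expected character size and `max_size` at the largest expected blank size follows the reasoning in the text; suitable values depend on the scanning resolution.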

The number of marks extracted from each document image by the above
processing is registered in the "number of marks" field 124' of the
registration image management table 1212'. Each extracted mark image is given
a mark ID, which is registered in the "mark ID" field 131' of the mark
management table 1213'. In addition, the image ID of the inputted image
bearing the mark, the position at which the mark was attached, and the mark
size are registered in the "image ID" field 132', the "position" field 134',
and the "size" field 135' of the mark management table 1213', respectively.

This embodiment is arranged so as to impart the document tag information
only to the image bearing the mark. However, when images are inputted between
the image bearing the first mark and the image bearing the next mark, and a
user wants to manage them as a series of document images belonging to the
image with the first mark, they can be managed by imparting the mark
management group No. to those images, as in the embodiment 5.

As in the embodiment 5, the calculating means 120A (the characteristics
value calculating means 1206 and the similarity calculating means 1207) and
the document tag information imparting means 1208 specify the document tag
information correlated with the mark image according to the characteristic
values of the well-known Moment Invariants. The specified document tag
information is registered in the "document tag information" field 133' of the
mark management table 1213'.
With the above-mentioned processing, the document tag information can be
imparted by automatically searching for the mark, through the simple
operation of stamping a mark on the blank part of the paper document to be
registered. In this case, the document tag information appointing page used
in the embodiment 5 is unnecessary, and only the document to be registered is
inputted. Consequently, the registration image management table 1212' and the
mark management table 1213' are configured more simply than the registration
image management table 1212 and the mark management table 1213 in the
embodiment 5.

Needless to say, this embodiment can also adopt the method of imparting the
two-dimensional code to the page bearing a mark in order to speed up the mark
extraction.

In this embodiment, the mark is stamped on the blank part of the side of
the paper document on which the content is written. However, when the scanner
or the like can scan both sides of the sheet, the mark may instead be stamped
on the back side. The same effect can be expected.

In addition, although the mark described above has a frame, the frame is
not always necessary. A mark without a frame is generally considered to be
composed of black-pixel-connected components whose size is larger than that
of the characters included in the text of the paper document, so this
embodiment can still be applied.

As mentioned above, first, since the invention is configured so that a
region whose first average of character size is larger than the extracting
criterion is extracted as a title region, a plurality of title regions can be
extracted from one document image. In addition, the extracting judgment can
be performed on a plurality of levels according to extracting parameters on a
plurality of levels. The title regions can thereby be determined according to
the characteristics of the inputted document images. Since the trimmed mean
method, which calculates the average after excluding both the characters
included in a specific proportion on the larger side and those included in a
specific proportion on the smaller side, is applied to the calculation of the
second average of character size over all regions and of the first average of
character size of each region, the precision of the extraction can be
improved.
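The title extraction summarized above can be illustrated with a short sketch. It assumes that each region is given as a list of character sizes, that the extracting criterion for each level is the second average multiplied by that level's extracting parameter, and that the parameters are supplied by the caller in descending order; how the processor derives the parameters is not reproduced here. The function names and the default proportion of 0.1 are hypothetical.

```python
def trimmed_mean(sizes, proportion):
    """Mean of the values left after discarding the given proportion
    of the smallest and of the largest values."""
    if not sizes:
        raise ValueError("sizes must not be empty")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(sizes)
    cut = int(len(ordered) * proportion)   # number dropped from each end
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def extract_title_regions(regions, parameters, proportion=0.1):
    """Return {region name: level} for regions judged to be titles.

    regions:    {name: [character sizes]}
    parameters: extracting parameters, largest first (level 1 = strongest).
    """
    first_averages = {
        name: trimmed_mean(sizes, proportion) for name, sizes in regions.items()
    }
    all_sizes = [s for sizes in regions.values() for s in sizes]
    second_average = trimmed_mean(all_sizes, proportion)
    titles = {}
    for name, average in first_averages.items():
        for level, parameter in enumerate(parameters, start=1):
            if average > second_average * parameter:
                titles[name] = level   # first (strictest) level that is passed
                break
    return titles
```

Because every region is compared against the criterion independently, more than one region can be returned as a title, which matches the effect stated in the paragraph above.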

Second, the invention can impart the document tag information to the
inputted image automatically, simply by inputting the marked document to the
document image processor, without using a keyboard or a pointing device. The
document image can then be searched using the document tag information thus
attached. The document image processor can therefore be managed and utilized
effectively.






What is claimed is: 

1. (Currently Amended) A document image processor comprising:

image inputting means for preparing a document image by reading a paper
document;

region dividing means for dividing the document image into a plurality of
regions; and

title-region extracting means for calculating first averages as an average
of character size for characters in each region divided by the region
dividing means, and then extracting title regions from the respective regions
according to the first averages,

wherein the title-region extracting means further comprises:

means for calculating a second average that is an average of character size
for characters within all the regions;

means for comparing the first averages and extracting criteria found by
multiplying the second average by extracting parameters, the extracting
parameters on a plurality of levels calculated based on a value found by
dividing a maximum of the first averages by the second average; and

means for extracting the regions with the first average larger than the
extracting criteria, as the title region.



2. (Currently Amended) A document image processor according to claim 1,
wherein the title-region extracting means calculates the first averages and
the second average based on an average height of characters.

3. (Currently Amended) A document image processor according to claim 1,
wherein the title-region extracting means calculates the first averages and
the second average based on an average width of characters.

4. (Currently Amended) A document image processor according to claim 1,
wherein the title-region extracting means calculates the first averages and
the second average based on an average area of characters.

5. (Cancelled)

6. (Currently Amended) A document image processor according to claim 1,
wherein the means for extracting the regions as the title region further
extracts each level attribute indicating the level corresponding to each
extracted title region.

7. (Cancelled) 

8. (Currently Amended) A document image processor according to claim 1,
wherein the title-region extracting means adopts the trimmed mean method for
discarding a specific proportion of the minimum and the maximum values and
then computing the means of the remaining values, in order to calculate the
first averages and the second average of character size.

9. (Currently Amended) A document image processor according to claim 1,
further comprising correcting means for correcting character strings of the
extracted title regions.

10. (Cancelled) 

11. (Currently Amended) A document title extracting method for a document
image processor comprising:

an image inputting step of preparing a document image by reading a paper
document;

a dividing step of dividing a plurality of regions from the document image;

a calculating step of calculating first averages as an average of character
size for characters in each region; and

a title-region extracting step of extracting title regions from the
respective regions according to the first averages, and

wherein the calculating step comprises a step for calculating a second
average that is an average of character size in all the regions,

the title-region extracting step comprises a step of comparing the first
averages and extracting criteria found by multiplying the second average by
extracting parameters, the extracting parameters on a plurality of levels
calculated based on a value found by dividing a maximum of the first averages
by the second average; and

a step of extracting the regions with the first average more than the
extracting criteria, as the title region.

12. (Currently Amended) A document title extracting method for a document
image processor according to claim 11, in which the calculating step
comprises a step of calculating the first averages and the second average
based on an average height of characters.



13. (Currently Amended) A document title extracting method for a document
image processor according to claim 11, in which the calculating step
comprises a step of calculating the first averages and the second average
based on an average width of characters.



14. (Currently Amended) A document title extracting method for a document
image processor according to claim 11, in which the calculating step
comprises a step of calculating the first averages and the second average
based on an average area of characters.

15. (Cancelled)

16. (Currently Amended) A document title extracting method for a document
image processor according to claim 11, in which the step of extracting the
regions as the title region further extracts each level attribute indicating
the level corresponding to each extracted title region.

17. (Cancelled)

18. (Currently Amended) A document title extracting method for a document
image processor according to claim 11, in which the title-region extracting
step comprises a step of calculating the first averages and the second
average according to the trimmed mean method for discarding a specific
proportion of the minimum and the maximum values and then computing the means
of the remaining values.

19. (Original) A document title extracting method of a document image
processor according to claim 11, further comprising the step of correcting
character strings of the extracted title regions.




20. (Cancelled) 



21. (Currently Amended) A computer readable medium storing a program for
performing the steps of:

dividing a document image prepared by reading a paper document into a
plurality of regions;

calculating first averages as an average of character size for characters
within each region and a second average that is an average of character size
in all the regions;

comparing the first averages and extracting criteria found by multiplying
the second average by extracting parameters, the extracting parameters on a
plurality of levels calculated based on a value found by dividing a maximum
of the first averages by the second average; and

extracting the regions with the first average more than the extracting
criteria, as the title region.



22 - 29. (Cancelled) 


30. (New) A document image processor comprising:

image inputting means for preparing a document image by reading a paper;

region dividing means for dividing the document image into a plurality of
regions; and

title-region extracting means for calculating first averages as an average
of character size for characters within each region divided by the region
dividing means, and then extracting title regions from the respective regions
according to the first averages, and

wherein the title-region extracting means further comprises:

means for calculating a second average that is an average value of
character size for characters within all the regions;

means for comparing the first averages and extracting criteria found by
multiplying the second average by extracting parameters; and

means for extracting the regions with the first average more than the
extracting criteria, as the title region,

wherein the first averages and the second average of character size are
calculated based on characters remaining after discarding a specific
proportion of the minimum and the maximum values of the character size.



31. (New) A document image processor according to claim 30, wherein the
title-region extracting means calculates the first averages and the second
average based on an average height of characters.

32. (New) A document image processor according to claim 30, wherein the
title-region extracting means calculates the first averages and the second
average based on an average width of characters.

33. (New) A document image processor according to claim 30, wherein the
title-region extracting means calculates the first averages and the second
average based on an average area of characters.

34. (New) A document image processor according to claim 30, further
comprising a correcting means for correcting character strings of the
extracted title regions.

35. (New) A document title extracting method for a document image processor
comprising:

an image inputting step of preparing a document image by reading a paper;

a dividing step of dividing a plurality of regions from the document image;

a calculating step of calculating first averages as an average of character
size for characters within each region; and

a title-region extracting step of extracting title regions from the
respective regions according to the first averages, and

wherein the calculating step comprises a step for calculating a second
average that is an average of character size in all the regions,

the title-region extracting step comprises a step of comparing the first
averages and extracting criteria found by multiplying the second average by
extracting parameters, the extracting parameters on a plurality of levels
calculated based on a value found by dividing a maximum of the first averages
by the second average; and a step of extracting the regions with the first
average larger than the extracting criteria, as the title region, and

the first averages and the second average are calculated according to the
trimmed mean method for discarding a specific proportion of the minimum and
the maximum values and then computing the means of the remaining values.

36. (New) A document title extracting method for a document image processor
according to claim 35, in which the calculating step comprises a step of
calculating the first averages and the second average based on an average
height of characters.

37. (New) A document title extracting method for a document image processor
according to claim 35, in which the calculating step comprises a step of
calculating the first averages and the second average based on an average
width of characters.

38. (New) A document title extracting method for a document image processor
according to claim 35, in which the calculating step comprises a step of
calculating the first averages and the second average based on an average
area of characters.

39. (New) A document title extracting method of a document image processor
according to claim 35, further comprising the step of correcting character
strings of the extracted title regions.

40. (New) A document title extracting method of a document image processor
according to claim 35, wherein the characters of which character size are
lower than the specific portion are punctuation marks.

41. (New) A document image processor according to claim 8, wherein the
characters of which character size are lower than the specific portion are
punctuation marks.
Abstract

A document image processing device and method for extracting a title region
and a mark attached by the user from a document image to use them as document
tag information. A region with a region average character size larger than a
predetermined extraction judging value is extracted as a title region by
title region extracting means. As a result, a plurality of title regions can
be extracted from one document image. A mark that the user makes on an input
image is extracted by mark extracting means, and a characteristic value of
the mark is found by calculating means. Document tag information to be
imparted to the input image is selected from reference tag information
according to the characteristic value and the attribute value of the
reference tag information by imparting means. Thus, document tag information
is automatically imparted to a document image.